Abstract


We conducted an evaluation study of the Middle School Algebra Readiness Initiative, 
a middle school mathematics intervention that was implemented in two West Virginia 
school districts during the 2011-2012 school year. In participating middle schools, the Car- 
negie Learning MATHia® software intervention and accompanying classroom curriculum 
were used as a total replacement for the districts’ alternative mathematics curriculum for 
Grades 6, 7, and 8. A cohort of teachers was trained by Carnegie Learning in mathematical 
content and pedagogy as well as in the proper implementation of the software and classroom 
curriculum materials. 


Our evaluation tested five hypotheses. Our first was related to the impact of the initi- 
ative on teacher-level outcomes, specifically teachers’ content and pedagogical knowledge in 
the areas of patterns, functions, and algebra. This hypothesis was tested by using a pretest/posttest
assessment of teacher knowledge. We used the research-validated Learning
Mathematics for Teaching (LMT) assessment. Our statistical analysis of teacher pre- 
test/posttest differences revealed that, for the 20 teachers who completed both a pretest and 
posttest, there was only a marginal gain. This gain was not statistically significant. As such, 
we rejected our first hypothesis. 


The remaining four study hypotheses tested the impact of the initiative on students’ 
mathematical achievement and year-to-year mathematics gains as measured by the Grade 6, 
7, or 8 mathematics subtest of the West Virginia Educational Standards Test 2 (WESTEST 
2). We used propensity score matching (PSM) to match students in a variety of implementation
scenarios to select a comparison group of students. The comparisons for our investigations
of Hypotheses 2-4 included students who used the MATHia
software/curriculum at various levels of implementation, matched to their grade-level peers 
who used some other curriculum during the 2011-2012 school year. For Hypothesis 5, we 
compared students who used the MATHia program for at least 1 hour per week—meeting the 
vendor’s definition of adequate use—to a comparison group of students who used the pro- 
gram for less time. In all cases, we rigorously matched the two groups of students using a 
variety of covariates including sex, free/reduced-price lunch eligibility, special education eli- 
gibility, grade, and prior academic achievement in both mathematics and reading/language 
arts. 


We then conducted two student-level analyses. First, we examined mean differences 
in students’ standardized test scores and mathematics gains, determining if the treatment or 
comparison group scores differed by a statistically significant margin. Second, we used line- 
ar regression to determine, after controlling for the aforementioned covariates, what level of 
impact the treatment had on student achievement and gains. We found in most cases that 
students who were in the treatment group underperformed when compared with their 
grade-level peers who used an alternate curriculum. With a few exceptions, the differences 
were statistically significant. However, the results of the linear regressions illustrated that, 
after controlling for important covariates, the negative relationship between treatment and
student achievement/gains was relatively small, but still statistically significant. 


Several limitations impair our ability to make conclusions based on these results.
Most critically, we had very little information about the degree to which the teachers and 
students implemented the intervention components with fidelity. We know very little about 
the quality or content of the training provided by Carnegie Learning. We experienced con- 
siderable attrition among educators in the period between the pretest and posttest LMT ad- 
ministration. Only 55% of teachers completed both assessments. We also received very 
different numbers of students from Carnegie Learning and the school districts when we re- 
quested this information for our analyses. Finally, we found that very few students met the 
implementation criteria recommended by Carnegie Learning. This finding, in particular, 
points to a potential lack of fidelity in implementation, which makes us very reluctant to al- 
low this evaluation to stand as a fair trial of the efficacy of the MATHia software/curriculum. 
In fact, we recommend strongly against using our report in this manner. It should be seen as 
an evaluation of an entire initiative rather than any curriculum or software program alone. 


In light of these and other limitations described in this report, we make only two rec- 
ommendations. First, we suggest future program implementations of this type take substan- 
tial measures to collect critical qualitative implementation data so that the results of 
quantitative analyses can be more readily interpreted. This can be accomplished, among 
other strategies, by devoting greater resources to the program evaluation component of such 
projects. Second, in districts where similar programs are currently underway or in the plan- 
ning stages, we recommend continuous monitoring and technical assistance to ensure that 
the program components are delivered as intended. Doing so may help prevent potential
negative impacts on student outcomes.


Introduction


This study evaluates the impact of the Middle School Algebra Readiness Initiative 
(MSARI). Our analysis focuses on the extent to which the initiative increased teachers’ 
mathematics content and pedagogical knowledge and students’ achievement and growth on 
the mathematics subtest of the West Virginia Educational Standards Test 2 (WESTEST 2). 
The WESTEST 2 is the state’s summative assessment, required under the federal No Child 
Left Behind Act. The mathematics subtest of WESTEST 2 is administered to all students in 
Grades 3-11 in the spring of each school year. This measure was chosen because of its avail- 
ability and its focus on concepts aligned to the proposed intervention curriculum. 


The MSARI was implemented in two West Virginia school districts during the 2011- 
2012 school year. These districts committed to using Carnegie Learning’s MATHia® soft- 
ware and accompanying classroom curriculum as a total replacement for the standard math- 
ematics curriculum for Grades 6, 7, and 8. Teachers in these schools were trained by 
Carnegie Learning in the use of the MATHia program via a series of mathematics teacher 
academies. We hypothesized that teachers who underwent this training would exhibit in- 
creased mathematical and pedagogical knowledge and that students who used MATHia dur- 
ing the 2011-2012 school year would achieve greater mathematics achievement and gains 
when compared with a matched sample of students who either used an alternate mathemat- 
ics curriculum or who did not use MATHia with fidelity. The following hypotheses were test- 
ed: 


H1 Teachers who participate in training provided as part of the MSARI will exhibit 
significantly greater posttest scores on the Learning Mathematics for Teaching 
(LMT) patterns, functions, and algebra assessment. 


H2 Students who use MATHia software during the 2011-2012 school year, regardless 
of their level of exposure, will score significantly higher on the WESTEST 2 math- 
ematics subtest than students who do not use the software. 


H3 Students who are continuously enrolled in a classroom where MATHia is being 
used for at least 210 days during the 2011-2012 school year will score significantly 
higher on the WESTEST 2 mathematics subtest than students who do not use the 
software. 


H4 Students who use MATHia software during the 2011-2012 school year for the rec- 
ommended minimum of at least 1 hour per week will score significantly higher on 
the WESTEST 2 mathematics subtest than students who do not use the software. 


H5 Students who use the MATHia software during the 2011-2012 school year for the 
recommended minimum of at least 1 hour per week will score significantly higher 
than students who use MATHia for less than the recommended minimum of 1 hour
per week.


Methods 


Participant Characteristics 


Teachers in this study attended the training provided by Carnegie Learning and also 
took the Learning Mathematics for Teaching (LMT) assessment. Students in the study used 
Carnegie Learning’s MATHia software and curriculum as their mathematics curriculum dur- 
ing the 2011-2012 school year; were in Grades 6, 7, and 8; and were included in the state’s 
2010-2011 and 2011-2012 WESTEST 2 assessment data files. These students were com- 
pared to a matched sample of students from the population of all Grade 6, Grade 7, and 
Grade 8 students who were known not to have used the MATHia software program as their 
mathematics curriculum during the 2011-2012 school year. 


Sampling Procedures 


Teachers 


We used all available records for those teachers who completed the LMT assessment 
at the conclusion of two mathematics academies, one at the outset of the 2011-2012 school 
year and the other at the end. For our pretest and posttest analyses, we could only include 
those teachers who had a matched pretest and posttest assessment, limiting our analysis of 
teacher outcomes to the 20 teachers who met this condition. 


Students 


We received an initial spreadsheet containing 2,265 unique student records along 
with software usage statistics from Carnegie Learning. We then requested a list of students 
from the two county school systems, which we used to cross reference and identify only 
those students who persisted in classrooms using MATHia for the majority of the academic 
year. To accomplish this, our county contacts asked teachers in each of the middle schools to 
verify on their currently active course rosters students who were enrolled in classrooms 
where MATHia was being utilized. The lists provided by the counties included 1,561 unique 
student names representing 70 classrooms and 22 teachers.1


Using the 1,561 students provided by the counties, we then queried the software us- 
age statistics from the data file provided by Carnegie Learning. The query returned 1,605 
records because some students were enrolled in multiple courses where MATHia was being 
utilized. After merging those valid duplicate cases and deleting all remaining duplicates, 
1,535 unique student records remained. As a final step prior to matching, we then queried 
assessment and demographic data for these students from WVEIS. We required 2 years of
assessment data as well as a full set of covariate demographic variables in order to conduct
the matching and final analyses. Therefore, any student for whom we could not locate this
information was removed from the sample. Our final sample included 1,276 students or 82%
of those records provided by the counties. These students were then matched using the population
of remaining Grade 6, Grade 7, and Grade 8 students (approximately 60,000 students).


1 It is unclear why there was such a large discrepancy in the numbers of students reported by
Carnegie Learning and by the counties.


We used propensity score matching (PSM) to select a set of matched comparison 
groups to test each hypothesis. PSM is a methodology that uses a logistic regression model to
summarize a variety of observed covariates in a single score, which is then used to match samples.


Matching procedures for student outcome analyses 


We created a binary indicator for whether a student did or did not participate in the 
MSARI (hereafter referred to as treatment or comparison students, respectively). We then 
used propensity scores to match each treatment student to a suitable comparison student. 
The propensity score is the conditional probability of being assigned to the treatment group 
given a vector of observed covariates. The goal of PSM is to model equivalent selection bias 
in both groups, thus exercising some degree of control over the impact of the observed co- 
variates on the outcome variable of interest. 


In this study, we sought primarily to control for prior academic achievement in both 
reading/language arts and mathematics, but specified up to 7 total covariates in the propensity
score models, including (1) 2010-2011 WESTEST 2 mathematics achievement, (2)
2010-2011 WESTEST 2 reading/language arts achievement, (3) sex, (4) race, (5) 
free/reduced price lunch eligibility, (6) special education eligibility, and (7) grade level. 
Thus, the propensity score we generated was the predicted probability of being assigned to 
the treatment condition obtained from a binary logistic regression including the listed co- 
variates as predictors (Rosenbaum & Rubin, 1983). Once propensity scores were calculated, 
we used nearest neighbor matching without replacement in the R statistical analysis soft- 
ware program to select comparison group members. Table 1 provides a description of each 
sampling frame, how the binary indicator of treatment/comparison was defined, and which 
hypothesis the sampling frame was used to test. 
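

The matching step described above can be illustrated with a minimal sketch. It assumes that propensity scores have already been estimated by the logistic regression model, and it uses a simple greedy rule (treated students processed from the highest score down). The function name, the ordering rule, and the student identifiers are hypothetical; the report does not specify these details of the R procedure that was actually used.

```python
def nearest_neighbor_match(treated, pool):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, pool: dicts mapping a student id to a propensity score.
    Returns a dict mapping each treated id to its matched comparison id.
    """
    available = dict(pool)  # copy so the caller's pool is not modified
    matches = {}
    # Assumed ordering rule: process treated students from highest score down.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: -kv[1]):
        if not available:
            break  # no comparison students left to match
        # Pick the remaining comparison student with the closest score.
        c_id = min(available, key=lambda cid: abs(available[cid] - t_score))
        matches[t_id] = c_id
        del available[c_id]  # "without replacement": each match is used once
    return matches


# Hypothetical propensity scores, for illustration only.
treated = {"T1": 0.62, "T2": 0.35}
pool = {"C1": 0.60, "C2": 0.33, "C3": 0.10, "C4": 0.58}
matches = nearest_neighbor_match(treated, pool)
```

Because each comparison student is removed from the pool once matched, no comparison student can serve as the match for more than one treatment student.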


Once matching was complete, we examined balance statistics for the samples to ensure
the matching algorithm resulted in samples that were comparable on the measured covariates.
Tables A1-A8 in Appendix A (page 31) illustrate, for each of the eight sampling
frames, the pre-/post-matching means for each covariate and the percentage of improve- 
ment in balance for each covariate after matching. Note that, because we did not discard 
treatment units, the treatment post mean is identical to the treatment pre mean; this is indi- 
cated with an asterisk (*). When matching is successful, the post mean difference should be 
as close as possible to zero indicating the groups do not differ on the observed covariate. For 
each sampling frame, we observed a remarkable improvement in the balance post-matching. 
Furthermore, we verified using chi-square analyses that, for all sampling frames, the covariate
distributions were not statistically significantly different between the two groups. We
found that after matching, with one exception2, there were no statistically significant differences
in these covariates at baseline. As a result, we were very confident going into our analyses
of student achievement and gains.
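

One common way to express the percentage of improvement in balance is sketched below. The formula is an assumption on our part for illustration, since the appendix tables report the statistic without defining it here, and the function name and example values are hypothetical.

```python
def percent_balance_improvement(treat_mean, comp_pre_mean, comp_post_mean):
    """Percent reduction in the absolute treatment-comparison mean difference.

    The treatment mean is the same before and after matching because
    no treatment units were discarded.
    """
    pre_diff = abs(treat_mean - comp_pre_mean)
    post_diff = abs(treat_mean - comp_post_mean)
    if pre_diff == 0:
        raise ValueError("groups were already identical before matching")
    return 100.0 * (pre_diff - post_diff) / pre_diff


# Hypothetical standardized prior-achievement means, for illustration only.
improvement = percent_balance_improvement(
    treat_mean=-0.25, comp_pre_mean=0.0, comp_post_mean=-0.20
)
```

A value near 100 indicates that matching removed almost all of the pre-matching difference on that covariate.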


Table 1. Description of Sampling Frames Used to Test Study Hypotheses


Sampling frame: SF16, SF17, and SF18
Hypothesis tested: H2
Description: A set of data frames containing all students who were enrolled in Grade 6 (SF16),
Grade 7 (SF17), and Grade 8 (SF18) in West Virginia during the 2011-2012 school
year, including a binary indicator for whether or not the student was in the
treatment group. Treatment group students were identified by virtue of their
having been located in the rosters provided by the participating school district
and Carnegie Learning. All treatment group students were included in these
sampling frames, regardless of the number of hours/sessions in the software
program.


Sampling frame: SF26, SF27, and SF28
Hypothesis tested: H3
Description: A set of data frames containing all students who were enrolled in Grade 6 (SF26),
Grade 7 (SF27), and Grade 8 (SF28) in West Virginia during the 2011-2012 school
year, including a binary indicator for whether or not the student was in the
treatment group. Treatment group students were identified by virtue of their
having been located in the rosters provided by the participating school district
and Carnegie Learning. Treatment group students with less than 210 days of
continuous enrollment from first to last program session were removed from
these sampling frames.


Sampling frame: SF3
Hypothesis tested: H4
Description: A data frame containing all students who were enrolled in Grade 6, Grade 7, and
Grade 8 in West Virginia during the 2011-2012 school year, including a binary
indicator for whether or not the student was in the treatment group. Treatment
group students were identified by virtue of their having been located in the
rosters provided by the participating school district and Carnegie Learning.
Treatment group students with less than 1 hour per week of program use
according to usage statistics provided by Carnegie Learning were removed from
the sampling frame.


Sampling frame: SF4
Hypothesis tested: H5
Description: A data frame containing ONLY treatment group students who were enrolled in
Grade 6, Grade 7, and Grade 8 in West Virginia during the 2011-2012 school
year, including a binary indicator for whether or not the student exhibited at least
1 hour of program use according to usage statistics provided by Carnegie
Learning. Treatment group students with less than the recommended 1 hour per
week of program use according to usage statistics provided by Carnegie Learning
were coded as the comparison group for this sampling frame.

2 Special education eligibility was not equally distributed across groups for SF18. The treatment
group included 11.2% while the control group included 7.2%. This difference was statistically
significant, χ²(1, N = 750) = 4.49, p = .03.


Sample Size, Power, and Precision


Teachers 


As mentioned above, due to attrition between the pretest and posttest administration 
of the LMT, our sample size for teacher-level outcomes was only 20. Thus, we did not have 
adequate power to detect small or moderate effects in this study. It is possible that our fail- 
ure to detect statistically significant differences in this study was due to this issue. We dis- 
cuss this further in the results and limitations sections of this report. 


Students 


Sample sizes varied within each of the aforementioned sampling frames. Tables B1-B4
in Appendix B (page 37) provide an overview of the final sample sizes for each sampling
frame by hypothesis tested. These tables also include an investigation of whether or not we 
had enough power to detect moderate effects of the magnitude observed in this study (d = 
.30). Notably, the final sample sizes for this study were adequate to detect these effects with 
95% confidence only for Hypothesis 2. However, this is somewhat of a moot point given that 
we did observe at least some statistically significant differences for Hypotheses 2-5. This
information is provided here only to illuminate the fact that our failure to detect significant
differences in some cases (H2, Grade 6; H3, Grade 7; and H5) may have been due to low
sample sizes. 
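

For readers who want a rough sense of the sample sizes involved, the sketch below approximates the number of students needed per group to detect an effect of d = .30 in a two-group comparison. It uses a normal approximation, a two-sided alpha of .05, and a target power of .80; the power target, the approximation, and the function name are assumptions made for illustration and are not necessarily the settings used for the Appendix B tables.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided, two-sample comparison.

    Uses the normal approximation n = 2 * (z_alpha/2 + z_power)^2 / d^2.
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


n_required = n_per_group(0.30)  # roughly 175 students per group
```

Under these assumptions, matched samples much smaller than this per group would be expected to miss effects of this size fairly often.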


Measures and Covariates 


Independent variables 


Independent variables are those that serve as the predictors of some outcome varia- 
ble (the outcome is often called a dependent variable). In this study, our independent varia- 
bles differed by level of analysis. 


Teacher-level analysis 


The independent variable in our teacher-level analysis was time. Our analysis exam- 
ined the level of change that occurred in teachers’ LMT assessment scores between the ad- 
ministration of the pretest and the posttest. That is, we expected that, over time, the average 
LMT assessment score would increase by virtue of teachers’ accumulation of content and 
pedagogical knowledge as a result of the training they received from Carnegie Learning. 


Student-level analyses 


In this study, the independent variables for student-level analyses included various 
levels of exposure to the MATHia curriculum and software intervention in place of the tradi- 
tional curriculum. Carnegie Learning provided us with the following information about each 
student’s use of MATHia during the 2011-2012 school year: 


1. Date of first session 
2. Date of last session 
3. Total number of seconds of MATHia use between first and last session 


In consultation with Carnegie Learning, we calculated two metrics that would serve as independent
variables in this study: (a) number of hours of MATHia use per week and (b) total
number of days enrolled in a MATHia classroom. Carnegie Learning recommended hours 
per week as one potential indicator of implementation fidelity, with the following levels: 


1. Low fidelity—less than 1 hour of MATHia use/week 
2. Moderate fidelity—at least 1 hour of MATHia use/week 
3. High fidelity—at least 1.5 hours of MATHia use/week 


We calculated hours/week using the steps detailed in Table 2. 


Table 2. Procedure for Calculating Hours/Week Fidelity Measure

Step                                                   Formula

Determine the total number of calendar days each       Last session date - First session date
student was an active participant in a classroom
implementing MATHia.

Convert the total number of calendar days to the       Session days ÷ 7
total number of calendar weeks.

Determine the total minutes of actual MATHia use       Total seconds of use ÷ 60
between the start and end date for each student.

Determine the total minutes of actual MATHia use       Total minutes of use ÷ Total weeks of exposure
per calendar week available for the student.

Determine the total hours of actual MATHia use per     Total minutes actual MATHia use ÷ 60
calendar week available for the student.
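

The steps in Table 2 can be sketched as a single function. The function name, argument names, and example dates are hypothetical; only the arithmetic follows the table.

```python
from datetime import date


def hours_per_week(first_session, last_session, total_seconds):
    """Average hours of MATHia use per calendar week, following Table 2."""
    # Step 1: calendar days between the first and last session.
    session_days = (last_session - first_session).days
    if session_days <= 0:
        raise ValueError("last session must fall after the first session")
    # Step 2: convert calendar days to calendar weeks.
    weeks = session_days / 7
    # Step 3: convert total seconds of use to minutes.
    minutes = total_seconds / 60
    # Step 4: minutes of use per calendar week.
    minutes_per_week = minutes / weeks
    # Step 5: convert minutes per week to hours per week.
    return minutes_per_week / 60


# Hypothetical student: 252 calendar days (36 weeks) and 18 hours of total use.
example = hours_per_week(date(2011, 9, 1), date(2012, 5, 10), 64_800)
```

Note that the denominator is calendar weeks between the first and last session, so school breaks and absences lower the resulting hours/week value.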


We then used the criteria illustrated in Table 3 to create four fidelity categories based 
on the recommendations of Carnegie Learning—three mutually exclusive groups and a 
fourth category representing both adequate and high implementers. 


Table 3. Fidelity Categories Developed for This Study 


Fidelity category     Cut points

Low                   0.00-.9999 hours/week
Adequate              1.00-1.499 hours/week
High                  1.50 and up hours/week
Adequate or high      ≥1.00 hours/week
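

The cut points in Table 3 could be applied as in the following sketch. The function names are hypothetical, and we assume that a value of exactly 1.00 hours/week counts as adequate, consistent with the "at least 1 hour" wording used by Carnegie Learning.

```python
def fidelity_category(hours_week):
    """Assign one of the three mutually exclusive fidelity categories."""
    if hours_week < 0:
        raise ValueError("hours per week cannot be negative")
    if hours_week < 1.0:
        return "low"
    if hours_week < 1.5:
        return "adequate"
    return "high"


def adequate_or_high(hours_week):
    """Fourth, combined category: at least 1 hour of use per week."""
    return fidelity_category(hours_week) in ("adequate", "high")
```

For example, a student averaging .64 hours/week would be classified as low fidelity, while a student at 1.5 hours/week would be classified as high fidelity.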


Table 4 presents the number of students that met each of the aforementioned fidelity 
conditions by grade level. It should be noted that these data were summarized prior to the 
implementation of the PSM algorithm. Some minor attrition did occur during matching. 


Table 4. Fidelity of Implementation for the Sample Used in this Study

                                    Number of students by fidelity range (hours/week)
        Average                     Low            Adequate       High            Adequate or high
Grade   hours/week   Range          (0.00-.9999)   (1.00-1.499)   (1.50 and up)   (>.999)

6       .64          .04-2.82       388 (88.6%)    35 (8.0%)      15 (3.4%)       50 (11%)
7       .57          .04-2.29       448 (93.3%)    27 (5.6%)      5 (1.0%)        32 (6.7%)
8       .65          .05-2.09       321 (83.8%)    43 (11.2%)     19 (5.0%)       62 (16.2%)


Upon examining these data, it immediately became clear that, using Carnegie Learning’s
recommended criteria, very few students actually reached the levels of implementation
that would be considered adequate or high. 


Covariates 


There were no covariates included in teacher-level analyses. For student-level anal- 
yses, we matched treatment and comparison cases on 7 covariates using PSM. We also used 
these covariates as predictors in a series of linear regression models. Each of the covariates is 
described below. 


Prior reading/language arts achievement 


For Hypotheses 2 and 3, which employed individual grade-level matching and anal- 
yses, we used students’ standardized 2010-2011 WESTEST 2 reading/language arts (RLA) 
scores as a measure of their RLA ability prior to the 2011-2012 school year. For Hypotheses
4 and 5, which aggregated students across grade levels, we used students’ WESTEST 2 RLA 
performance levels for the 2010-2011 school year. This covariate was included in the matching
model to ensure that the treatment and comparison groups comprised students with
similar RLA skills prior to the intervention. 


Prior mathematics achievement 


For Hypotheses 2 and 3, which employed individual grade-level matching and main 
analyses, we used students’ standardized 2010-2011 WESTEST 2 mathematics scores as a 
measure of their mathematical ability prior to the 2011-2012 school year. For Hypotheses 4 
and 5, which aggregated students across grade levels, we used students’ WESTEST 2 math- 
ematics performance levels for the 2010-2011 school year. This covariate was arguably the 
most important variable included in the matching model because it ensured that the treatment
and comparison groups comprised students with similar mathematics skills prior to
the intervention. Had we not accounted for this variable, it may have introduced significant
bias in our analysis of 2011-2012 mathematics achievement and gains. The correlation be- 
tween students’ prior and current mathematics achievement is known to be statistically sig- 
nificant and of great magnitude. 


Sex 


Student biological sex is known to be associated with academic achievement such 
that male students are often significantly lower performing than their female peers in both 
mathematics and reading/language arts. Thus, it was included in all matching models. 


Race


Student race was operationalized as a binary indicator denoting whether or not stu- 
dents were White. Caucasian students represent approximately 92% of all students in West 
Virginia. 

Free and reduced-price lunch eligibility 


Students’ socioeconomic status was operationalized using a proxy measure, free and 
reduced-price lunch eligibility. This indicator was binary, indicating whether or not the stu- 
dent was eligible. This variable is known to possess a negative and statistically significant 
relationship with student achievement. 


Special education eligibility 


Special education eligibility was operationalized as a binary indicator, which indicat- 
ed whether or not a student had an individualized education program (IEP). Special educa- 
tion eligibility is known to possess a negative and statistically significant relationship with 
student achievement. 


Grade 


Grade level was controlled for in Hypotheses 2 and 3 by conducting the PSM match- 
ing within each grade-level band. That is, there was no variability in grade level for these 
analyses; students were only matched to other students in the same grade level. With respect 
to Hypotheses 4 and 5, grade level was operationalized as three binary indicators. Each vari- 
able indicated whether or not the student was in Grade 6, Grade 7, or Grade 8 during the 
implementation year. 


Dependent variables 
Teacher content and pedagogical knowledge 


Teachers’ gains in content and pedagogical knowledge were measured in this study 
via pretest and posttest administration of the Learning Mathematics for Teaching (LMT) 
assessment (Hill, Schilling, & Ball, 2004). The LMT is a teacher assessment that includes a
battery of diverse assessments appropriate to measure mastery of multiple mathematical 
concepts at various programmatic levels. The measures have been extensively validated via 
multiple research studies and were developed with ongoing support from the National Sci- 
ence Foundation.3 


For this study, we selected the 2007 revision of the Middle School Patterns, Func- 
tions, and Algebra subtest (PFA). This subtest consists of two equated forms, each including 
33 items. The items assess the extent to which teachers have the ability to solve mathematics 
problems of the types typically assigned to their students and how well they are able to eval- 
uate students’ knowledge of mathematics. The results of this assessment are normed based 
on a large and geographically representative sample of middle school mathematics teachers.
Raw scores are converted to standardized scores and gains across multiple forms can be analyzed
to determine statistical significance.


3 For more information about the LMT project, readers are referred to
http://sitemaker.umich.edu/lmt/home.


We administered the LMT to participating teachers prior to the first mathematics 
content academy (pretest) and at the conclusion of the final content academy of the year 
(posttest). During the administration of the pretest, we assigned both of the equated forms 
of the assessment in a staggered fashion such that the first teacher was assigned Form A, the 
second Form B, the third Form A, and so forth. Each teacher was then assigned the alternate 
form during the posttest. For example, if a teacher was assigned Form A at pretest, he or she 
was assigned Form B at posttest. Because both forms were equated, this allowed us to ana- 
lyze gains on the assessment from pretest to posttest without worrying about test-retest bias. 
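

The staggered assignment of forms can be sketched as follows. The function name and teacher identifiers are hypothetical; the alternation and the switch to the other form at posttest follow the description above.

```python
def assign_forms(teacher_ids):
    """Alternate Form A and Form B at pretest; give the other form at posttest."""
    assignments = {}
    for index, teacher in enumerate(teacher_ids):
        pretest = "A" if index % 2 == 0 else "B"  # first A, second B, third A, ...
        posttest = "B" if pretest == "A" else "A"  # alternate form at posttest
        assignments[teacher] = (pretest, posttest)
    return assignments


# Hypothetical teacher identifiers, in the order the pretest was handed out.
forms = assign_forms(["t01", "t02", "t03"])
```

Under this scheme no teacher sees the same items twice, and roughly half of the teachers take each form at each time point.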


Student mathematics achievement and gains 


We assessed the effect of the intervention on both mathematics achievement and 
mathematics gains. Math achievement was operationalized as students’ standardized math 
assessment scores from the 2011-2012 administration of the WESTEST 2—the assessment 
administered at the end of the school year during which the intervention took place. Scores 
were standardized within each grade level so that the state mean score for each grade was 
zero and the standard deviation was 1. This allowed for easy interpretation of scores (e.g., a 
score of .25 is the equivalent of one quarter standard deviation above the state mean) and 
also for valid aggregation of assessment results across grade levels to increase effective sam- 
ple sizes for some tests. Readers should keep in mind that standardized test scores indicate a 
student’s relative position within the distribution of her/his grade-level peers. Conversely, 
students’ scale scores are relatively nebulous quantities that have little interpretive value ex- 
cept as they relate to a cut score that expresses a policy expectation (e.g., proficiency). 


Math gains were operationalized as the difference in students’ 2011-2012 and 2010- 
2011 standardized math assessment scores. That is, for each student, we subtracted his or 
her 2010-2011 standardized score from his or her 2011-2012 standardized score. For exam- 
ple, if a student exhibited a 2011-2012 score of 1.0 and a 2010-2011 score of .70, her math 
gain score would be 1.0 minus .70 or .30. Positive scores represent increases in relative 
standing from one year to the next while negative gain scores represent regression in stand- 
ing from one year to the next. 


Importantly, regression in standardized scores may not necessarily correspond to 
lower scale scores. That is, a student’s actual test score may increase from one year to the 
next while their standardized score decreases. 
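

The standardization and gain calculations can be illustrated with a minimal sketch. The function names and the scale scores are hypothetical, and the use of the population standard deviation is an assumption; the gain example reproduces the 1.0 minus .70 case described above.

```python
from statistics import mean, pstdev


def standardize(scale_scores):
    """Convert scale scores to z scores with mean 0 and standard deviation 1."""
    mu = mean(scale_scores)
    sigma = pstdev(scale_scores)  # assumed: population standard deviation
    return [(score - mu) / sigma for score in scale_scores]


def math_gain(z_current, z_prior):
    """Gain = current-year standardized score minus prior-year standardized score."""
    return z_current - z_prior


# Hypothetical scale scores for one grade level, for illustration only.
z_scores = standardize([400, 500, 600])

# Example from the text: a 2011-2012 score of 1.0 and a 2010-2011 score of .70.
gain = math_gain(1.0, 0.70)  # approximately .30
```

Because each year's scores are standardized separately, a gain reflects a change in relative standing among grade-level peers rather than a change in scale score.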


Research Design 
Teacher-level analyses 


We used a paired-samples (dependent samples) t test to determine if the average difference in
pretest and posttest scores for teachers was statistically significantly different from zero. If this were
found to be true, we would accept our hypothesis that participation in the MSARI led to in- 
creased content and pedagogical knowledge. In the results section of this report we present 
the results of this analysis and also compare the average pretest and posttest scores using 
descriptive statistics. 
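

The test statistic for a paired-samples t test can be computed as in the sketch below. The function name and the scores are hypothetical and are not the teachers' actual LMT results; obtaining a p value would additionally require the t distribution with n - 1 degrees of freedom, which is not shown here.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(pre, post):
    """Return the paired-samples t statistic and its degrees of freedom."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need at least two matched pretest/posttest pairs")
    diffs = [b - a for a, b in zip(pre, post)]  # posttest minus pretest
    n = len(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences have zero variance")
    t_stat = mean(diffs) / (sd / sqrt(n))
    return t_stat, n - 1


# Hypothetical scores for four teachers, for illustration only.
pre = [10, 12, 9, 14]
post = [12, 13, 8, 18]
t_stat, df = paired_t(pre, post)
```

The statistic is the mean pretest-to-posttest difference divided by its standard error, so a small sample with variable differences yields a small t value even when the mean difference is positive.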


Student-level analyses


We tested Hypotheses 2 to 5 first by conducting a series of independent samples t 
tests. These simple tests were used to identify the presence or absence of statistically significant
differences in math achievement and gains between students in the treatment and comparison
groups. The t tests illustrated whether the two groups differed, and descriptive statistics
illustrated the amount and direction of those differences. We posited that, if the t tests re- 
turned significant results and those results were in the predicted direction, we could accept 
our study hypotheses that students who used the MATHia software and curriculum exhibit- 
ed higher math achievement and gains than students who used the alternative curriculum. 


The t tests are useful to illustrate where statistically significant differences exist, but 
they are not sufficient to accurately estimate the impact of the treatment when accounting 
for other important covariates that have an impact on the outcome such as prior academic 
achievement. To address this, we also employed a series of linear regression models. These 
models allowed us to estimate the proportion of variance in math achievement/gains accounted 
for by each covariate, including a binary indicator of each student’s status as either a 
treatment or comparison group member. We conducted each linear regression in two se- 
quential blocks. During the first block, we simultaneously entered all of the covariates used 
in the PSM models as predictors of the dependent variable under examination. This first 
model allowed us to determine our ability to predict the dependent variable without the 
treatment variable having been accounted for. Our second model included the same covari- 
ates, but added the treatment variable. Comparing the output of both models allowed us to 
calculate the unique contribution of the treatment to students’ math achievement/gains after 
accounting for the impact of the measured covariates. 
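
The two-block procedure can be sketched as follows: fit the covariates-only model, fit the model with the treatment indicator added, and take the difference in R². This is a minimal standard-library illustration under our own assumptions. The helper names, the two covariates, and the eight data rows are hypothetical, and the study's models included more covariates and reported adjusted R² for Model 1.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def r_squared(rows, y):
    """Fit ordinary least squares with an intercept and return R^2."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical data for eight students (not study data).
prior_math = [0.5, -0.2, 1.1, 0.0, -0.8, 0.3, 0.9, -0.4]
frpl = [0, 1, 0, 1, 1, 0, 0, 1]
treatment = [1, 1, 0, 0, 1, 0, 1, 0]
outcome = [0.3, -0.5, 1.2, 0.1, -1.0, 0.4, 0.6, -0.3]

# Block 1: covariates only. Block 2: same covariates plus the treatment indicator.
r2_block1 = r_squared(list(zip(prior_math, frpl)), outcome)
r2_block2 = r_squared(list(zip(prior_math, frpl, treatment)), outcome)
r2_change = r2_block2 - r2_block1
print(round(r2_block1, 3), round(r2_block2, 3), round(r2_change, 3))
```

Because the second model nests the first, R² cannot decrease when the treatment variable is added; the size of the increase is the unique contribution attributed to treatment.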


Table 5 and Table 6 provide an overview of the general structure of the models we 
used. The reader will notice that the models we used to test H4 and H5 were slightly differ- 
ent. This is because, in these models, we had to account for grade level due to aggregation 
(see above). These models also differed in that we accounted for prior academic achievement 
using students’ prior performance levels rather than their standardized assessment scores. 


Table 5. Overview of Linear Regression Model Structures for H2 and H3 
Model Structure 
1 2011-2012 math achievement/gain = sex + free and reduced price lunch eligibility + race + 
special education eligibility + 2010-2011 math achievement + 2010-2011 RLA achievement 
2 2011-2012 math achievement/gain = sex + free and reduced price lunch eligibility + race + 
special education eligibility + 2010-2011 math achievement + 2010-2011 RLA achievement + 
treatment 

Table 6. Overview of Linear Regression Model Structure for H4 and H5 
Model Structure 
1 2011-2012 math achievement/gain = sex + free and reduced price lunch eligibility + race + 
special education eligibility + 2010-2011 math performance level + 2010-2011 RLA 
performance level + Grade 6 + Grade 7 + Grade 8 
2 2011-2012 math achievement/gain = sex + free and reduced price lunch eligibility + race + 
special education eligibility + 2010-2011 math performance level + 2010-2011 RLA 
performance level + Grade 6 + Grade 7 + Grade 8 + treatment 

Hypothesis 1 


Hypothesis 1 stated, “Teachers who participate in training provided as part of the 
Middle School Algebra Readiness Initiative (MSARI) will exhibit significantly greater 
posttest scores on the Learning Mathematics for Teaching (LMT) patterns, functions, and 
algebra assessment.” Thirty-six teachers completed the LMT pretest assessment. Their 
standardized LMT scores ranged from -1.21 to 1.66, with a mean score of .19 (sd = .75) for 
the group. This corresponds to, on average, answering approximately 18 of the 33 questions 
correctly at pretest. 


Because we required both a pretest and posttest record to complete the gains analy- 
sis, those teachers who completed a pretest, but not a posttest, were necessarily excluded 
from the sample (n = 16). With this adjustment, the final sample size for our gains analysis 
was 20. The average pretest score for the 20 teachers who completed both a pretest and 
posttest was .43 (sd = .60), which indicates, on average, the 20 teachers who completed both 
a pretest and posttest answered approximately 20 of the 33 questions correctly at pretest, 
slightly higher than for the full sample. 


The posttest scores for these 20 teachers ranged from -1.06 to 1.36 with a mean of .51 
(sd = .65). This illustrates an average pretest to posttest gain of only .08. Put another way, 
on average, teachers answered approximately 20 of the 33 questions correctly at posttest. As 
we indicated earlier, this was the same number of pretest questions answered correctly. 
Therefore, there was no discernible 
difference in the average number of 
correct responses between pretest 
and posttest. 


Figure 1. LMT Patterns, Functions, and Algebra 
Pretest and Posttest Scores for MSARI 


Table 7 presents the results of the paired samples t test analysis used to determine if 
pretest and posttest scores differed significantly. The difference was not statistically 
significant, t(19) = .513, p = .61. This result indicates that, for the 20 teachers who 
completed both a pretest and posttest, their content and pedagogical knowledge of patterns, 
functions, and algebra, though increasing slightly, was not significantly greater at the 
conclusion of the project. Figure 1 provides a graphical representation of the average 
pretest and posttest scores. 

Table 7. Significance Test for Difference in LMT Pretest and Posttest Scores 

Pair         Mean   sd    Standard error of the mean   t      df   p 
Post - Pre   .08    .71   .16                          .513   19   .61 
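
The entries in Table 7 can be checked against one another with a short calculation. This sketch uses the rounded values as printed, so it reproduces the reported standard error (about .16) and a t statistic of about .50 rather than exactly .513; we assume the small gap reflects rounding of the mean and sd in the table.

```python
import math

n = 20            # teachers with both a pretest and a posttest
mean_diff = 0.08  # mean of (posttest - pretest), as printed in Table 7
sd_diff = 0.71    # sd of the differences, as printed in Table 7

se = sd_diff / math.sqrt(n)  # standard error of the mean difference
t_check = mean_diff / se     # paired t statistic from the rounded inputs
df_check = n - 1

print(round(se, 2), round(t_check, 2), df_check)
```

With 19 degrees of freedom, the two-tailed critical value at the .05 level is about 2.09, so a t statistic near .5 is well short of significance, consistent with the reported p of .61.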


Hypothesis 2 


Hypothesis 2 stated, “Students who use the MATHia software during the 2011-2012 
school year, regardless of their level of exposure, will score significantly higher on the 
WESTEST 2 math subtest than students who do not use the software.” Table 8 presents the 
results of the t test analyses by grade level. 


Table 8. T Test Results for Hypothesis 2 


Grade   T mean (sd)   C mean (sd)   t        df    p      Mean difference   Significant? 
2011-2012 math achievement 
6       .20 (.91)     .30 (.93)     -1.632   868   .103   -.10              NO 
7       .13 (.95)     .26 (.97)     -2.087   930   .037   -.13              YES 
8       -.27 (.96)    -.00 (.92)    -4.017   748   .000   -.27              YES * 
2010-2011 to 2011-2012 math gains 
6       -.14 (.63)    -.03 (.64)    -2.550   868   .011   -.11              YES 
7       -.06 (.70)    .03 (.66)     -2.250   930   .025   -.10              YES 
8       -.14 (.77)    .09 (.77)     -4.33    748   .000   -.24              YES * 


*Recall that, for these analyses, the treatment group included a significantly greater proportion of students 
who were special education eligible than the comparison group. As such, we recommend caution 
interpreting these results. 


As is evidenced in Table 8 and Figure 2 through Figure 7, despite starting at compa- 
rable points in 2011 math and RLA achievement and having remarkably similar demograph- 
ic characteristics, students in the treatment group in Grades 7 and 8 scored significantly 
lower than students in the comparison group on the WESTEST 2 math assessment in 2012. 
Additionally, treatment group students in all three grades exhibited significantly lower math 
gains from 2010-2011 to 2011-2012 than students in the comparison group. These findings 
were counter to our hypothesis. 

Figure 2. Grade 6 Mathematics Achievement, Treatment vs. Control: All MATHia Users 
Figure 3. Grade 5-to-6 Mathematics Gain, Treatment vs. Control: All MATHia Users 
Figure 6. 

Tables C1-C18 in Appendix C (page 39) present detailed results of the six general linear 
models we used to test the explanatory power of the treatment on students’ math 
achievement and gains after accounting for all measured covariates. Table 9 below provides 
a summary of those models. As displayed below, the treatment coefficient was a statistically 
significant and negative predictor in all six models. However, it should be noted that 
treatment contributed only minor explanatory power after accounting for covariates (i.e., 
between .3% and 2.4% of the variance, per the R² change values in Table 9). 


Table 9. Abbreviated Linear Model Summaries for Hypothesis 2 

                           Model 1   Model 2     p value for   p value for treatment   Standardized β for 
Model                      adj. R²   R² change   model 2       coefficient             treatment coefficient 
Grade 6 math achievement   .623      .003        .000          .015                    -.051 
Grade 6 math gains         .239      .005        .000          .015                    [illegible] 
Grade 7 math achievement   .595      .003        .000          .008                    -.055 
Grade 7 math gains         .203      .006        .000          .008                    -.07 
Grade 8 math achievement   .435      .017        .000          .000                    -.129 
Grade 8 math gains         .166      .024        .000          .000                    -.157 

Interpretation: The treatment was a statistically significant and negative predictor in all 
models, but only contributed between .3% and 2.4% explanatory power after accounting for 
covariates. 

Hypothesis 3 


Hypothesis 3 stated, “Students who have been continuously enrolled in a classroom 
where MATHia is being utilized for at least 210 days during the 2011-2012 school year will 
score significantly higher on the WESTEST 2 math subtest than students who do not use the 
software.” Table 10 presents the results of the t test analyses by grade level. 


Table 10. T Test Results for Hypothesis 3 

Grade   T mean (sd)   C mean (sd)   t       df    p      Mean difference   Significant? 
2011-2012 math achievement 
6       -.03 (.79)    .22 (.84)     -2.80   316   .005   -.26              YES 
7       .02 (.94)     .04 (.94)     -.268   478   .789   -.02              NO 
8       -.31 (.93)    .06 (.87)     -3.92   334   .000   -.38              YES 
2010-2011 to 2011-2012 math gains 
6       -.30 (.60)    -.08 (.60)    -3.33   316   .001   -.22              YES 
7       -.03 (.67)    -.03 (.77)    .048    478   .962   .00               NO 
8       -.19 (.81)    .14 (.81)     -3.92   334   .000   -.34              YES 


As is evidenced in Table 10 and Figure 8 through Figure 13, despite starting at com- 
parable points in 2011 math and RLA achievement and having remarkably similar demo- 
graphic characteristics, students in the treatment group in Grades 6 and 8 scored 
significantly lower than students in the comparison group on the WESTEST 2 math assess- 
ment in 2012. Additionally, treatment group students in Grades 6 and 8 exhibited signifi- 
cantly lower math gains from 2010-2011 to 2011-2012 than students in the comparison 
group. There were no significant differences in Grade 7. These findings were counter to our hypothesis. 


Figure 8. Grade 6 Mathematics Achievement, Treatment vs. Control: 210 Days in MATHia 
Figure 9. Grade 5-to-6 Mathematics Gain, Treatment vs. Control: 210 Days in MATHia 
Figure 10. Grade 7 Mathematics Achievement, Treatment vs. Control: 210 Days in MATHia 
Figure 11. Grade 6-to-7 Mathematics Gain, Treatment vs. Control: 210 Days in MATHia 
Figure 12. Grade 8 Mathematics Achievement, Treatment vs. Control: 210 Days in MATHia 
Figure 13. Grade 7-to-8 Mathematics Gain, Treatment vs. Control: 210 Days in MATHia 

Tables C19-C36 in Appendix C present detailed results of the six general linear models 
we used to test the explanatory power of the treatment on students’ math achievement 
and gains after accounting for all measured covariates. Table 11 below provides a summary 
of those models. As displayed below, the treatment coefficient was a statistically significant 
and negative predictor in four of the six models—it was not a significant predictor of Grade 7 
math achievement or gains. However, it should be noted that, when significant, the treat- 
ment coefficient only contributed minor explanatory power to the model after accounting for 
covariates (i.e., between 1.5% and 4.4% of the variance). 


Table 11. Abbreviated Linear Model Summaries for Hypothesis 3 

                           Model 1   Model 2     p value for   p value for treatment   Standardized β for 
Model                      adj. R²   R² change   model 2       coefficient             treatment coefficient 
Grade 6 math achievement   .618      .015        .000          .000                    -.121 
Grade 6 math gains         .266      .028        .000          .000                    -.168 
Grade 7 math achievement   .525      .000        .000          .734                    -.011 
Grade 7 math gains         .191      .000        .000          .734                    -.014 
Grade 8 math achievement   .349      .036        .000          .000                    -.191 
Grade 8 math gains         .203      .044        .000          .000                    -.211 

Interpretation: The treatment was a statistically significant and negative predictor in all 
but two models. When significant, the treatment only contributed between 1.5% and 4.4% 
explanatory power after accounting for covariates. 

Hypothesis 4 


Hypothesis 4 stated, “Students who have used the MATHia software during the 
2011-2012 school year for the recommended minimum of at least 1 hour per week will score 
significantly higher on the WESTEST 2 math subtest than students who have not used the 
software.” Table 12 presents the results of the t tests, aggregating all three grade levels. 


Table 12. T Test Results for Hypothesis 4 

Grade         T mean (sd)   C mean (sd)   t        df        p      Mean difference   Significant? 
2011-2012 math achievement 
6, 7, and 8   -.25 (.85)    .05 (.76)     -3.327   294       .001   -.31              YES 
2010-2011 to 2011-2012 math gains 
6, 7, and 8   -.16 (.76)    .08 (.60)     -3.103   279.092   .002   -.25              YES 


As is evidenced in Table 12 and in Figure 14 and Figure 15, despite starting at compa- 
rable points in 2011 mathematics and reading/language arts achievement and having re- 
markably similar demographic characteristics, students in the treatment group scored 
significantly lower than students in the comparison group on the WESTEST 2 mathematics 
assessment in 2012. Additionally, treatment group students exhibited significantly lower 
mathematics gains from 2010-2011 to 2011-2012 than students in the comparison group. 
These findings were counter to our hypothesis. 


Figure 14. All Grades Mathematics Achievement, Treatment vs. Control: At Least 1 Hour/Week 
[Figure not reproduced. Axis labels: WESTEST 2 Math 2011, WESTEST 2 Math 2012. Series: Treatment, Comparison.] 

Figure 15. 2011-to-2012 Mathematics Gain, Treatment vs. Control: At Least 1 Hour/Week 
[Figure not reproduced. Axis label: One Year Mathematics Gain (Z-Score), by group. Series: WESTEST 2 Math Gain 2011 to 2012.] 


Tables C37-C42 in Appendix C present detailed results of the two general linear 
models that we used to test the explanatory power of the treatment on students’ math 
achievement and gains after accounting for all measured covariates. Table 13 below provides 
a summary of those models. As displayed below, the treatment coefficient was a statistically 
significant and negative predictor in both models. Though significant, the treatment coeffi- 
cient only contributed minor explanatory power to the model after accounting for covariates 
(i.e., between 2.6% and 3.4% of the variance). 


Table 13. Abbreviated Linear Model Summaries for Hypothesis 4 

                              Model 1   Model 2     p value for   p value for treatment   Standardized β for 
Model                         adj. R²   R² change   model 2       coefficient             treatment coefficient 
All grades math achievement   .476      .026        .000          .000                    -.161 
All grades math gains         .120      .034        .000          .001                    -.1[?]6 

Interpretation: The treatment was a statistically significant and negative predictor in both 
models, but only contributed between 2.6% and 3.4% explanatory power after accounting for 
covariates. 

Hypothesis 5 


Hypothesis 5 stated, “Students who have used the MATHia software during the 
2011-2012 school year for the recommended minimum of at least 1 hour per week will score 
significantly higher than students who have used MATHia for less than the recommended 
minimum of at least 1 hour per week.” Table 14 presents the results of the t test analyses, ag- 
gregating all three grade levels. 


Table 14. T Test Results for Hypothesis 5 

Grade         T mean (sd)   C mean (sd)   t       df    p      Mean difference   Significant? 
2011-2012 math achievement 
6, 7, and 8   -.24 (.83)    -.14 (.93)    -.951   286   .340   -.09              NO 
2010-2011 to 2011-2012 math gains 
6, 7, and 8   -.15 (.75)    -.10 (.73)    -.618   286   .537   -.05              NO 


As is evidenced in Table 14 and in Figure 16 and Figure 17, there were no statistically 
significant differences between high and low use students with respect to 2012 math 
achievement or math gains from 2010-2011 to 2011-2012. 


Tables C43-C48 in Appendix C present detailed results of the two general linear 
models we used to test the explanatory power of the treatment on students’ math achieve- 
ment and gains after accounting for all measured covariates. Table 15 below provides a 
summary of those models. As displayed below, the treatment coefficient was not statistically 
significant in either model. 


Figure 16. All Grades Mathematics Achievement, High Treatment vs. Low Treatment 
Figure 17. 2011-to-2012 Mathematics Gain, High Treatment vs. Low Treatment 

Table 15. Abbreviated Linear Model Summaries for Hypothesis 5 

                              Model 1   Model 2     p value for   p value for treatment   Standardized β for 
Model                         adj. R²   R² change   model 2       coefficient             treatment coefficient 
All grades math achievement   .462      .001        .000          .415                    -.036 
All grades math gains         .081      .003        .000          .355                    -.053 

Interpretation: The treatment was not a statistically significant predictor in either model. 


Discussion 


Hypothesis 1 posited that, “Teachers who participate in training provided as part of 
the Middle School Algebra Readiness Initiative (MSARI) will exhibit significantly greater 
posttest scores on the Learning Mathematics for Teaching (LMT) patterns, functions and 
algebra assessment.” Based on the results of our analyses, we rejected this hypothesis. We 
found that teachers’ performance on a rigorously developed and research-validated teacher 
assessment of content and pedagogical knowledge remained virtually static from pretest to 
posttest. There was a negligible gain for the participating teachers who completed both the 
pretest and posttest, but statistical tests revealed that this gain was not statistically significant. 


There are multiple potential explanations for this finding, including a possible lack of 
quality in the training provided, poor retention of the material on the part of participating 
teachers, or a low degree of alignment between the training and the content that appears on 
the LMT assessment. Without additional contextual knowledge, we can only speculate. We 
must acknowledge several limitations in our ability to thoroughly test Hypothesis 1. First, 
our final analysis was limited by the fact that we received completed LMT posttests for only 
20 of the original 36 teachers who completed the pretest at the outset of the study (56%). 
Because we required a completed pretest and posttest record for our analyses, this limited us 
to 20 cases for analysis. Approximately 44 cases would be required to have confidence in our 
ability to detect statistically significant, but small effect sizes. It is unclear from this study 
why the remaining 16 teachers did not complete the posttest. It is possible that they simply 
opted out of taking the assessment given its voluntary nature, but it is equally possible that 
they ceased participating in the initiative altogether. It was clear from a post-hoc examination 
of the pretest results that the average score for the 20 teachers who ended up persisting 
throughout the entire initiative and who ultimately completed a posttest was higher than the 
pretest score for all 36 teachers: .43 versus .19, a difference of .24. In terms of a raw score, 
the average for all 36 pretest completers was two points lower than the score for those 
pretest completers who persisted long enough to complete a posttest. Given these findings, it 
appears that those teachers who persisted in the initiative were potentially more knowledgeable 
in the concepts measured by the LMT than those who did not persist. This uncertainty 
raises questions about the degree to which the outcomes we observed in our study would be 
different if all pretested teachers were included. 


Hypotheses 2 through 5 were concerned with ascertaining the impact of participating 
in the MSARI on students’ math achievement and gains. In all cases, the results indicated to 
us that we should reject our conjecture that treatment group students would outperform 
comparison group students. In fact, in most cases, students in the treatment condition 
exhibited lower math achievement and gains than students in the comparison condition. In 
almost all instances, these differences were statistically significant. This unanticipated and 
negative relationship persisted when examining only those students who met the limited 
fidelity criteria recommended by Carnegie Learning and after accounting for the impact of 
multiple covariates. The relationship was strongest in our analyses of Grade 8 outcomes. 
However, we must acknowledge that once covariates were controlled for, the negative 
relationship between treatment and students’ math achievement and gains was very small, 
accounting for less than 5% of the total variance in all tested models. 


The persistence of this negative relationship is quite troubling. Consider the results 
of Hypothesis 4, which illustrate that prior to the intervention, both groups of students ex- 
hibited math achievement that placed them just below the state average when compared 
with their grade-level peers. The difference between these groups at baseline was not statis- 
tically significant. However, after the intervention year, students in the treatment condition, 
all of whom used the MATHia software for one or more hours a week, declined in achieve- 
ment to land approximately a quarter of a standard deviation below the state average—a 
considerable deficit. Meanwhile, their peers in the comparison group who did not implement 
MATHia managed to achieve a small gain, which placed them at the state average when 
compared with their grade-level peers (see footnote 4). This same trend was evidenced with respect to 
Grade 8 in Hypothesis 2 and Hypothesis 3. Also consider the results of Hypothesis 3, where 
both groups of sixth grade students started approximately one third of a standard deviation 
above the state average prior to the intervention, a considerable advantage. Again, this dif- 
ference at baseline was not statistically significant. However, at the conclusion of the inter- 
vention year, students in the comparison condition had regressed to land nearly a quarter 
standard deviation above the state average while students in the treatment condition, all of 
whom had at least 210 days between their first and last MATHia session, regressed to the 
state average. To be absolutely clear, we are not stating that the intervention caused these 
negative effects, but nevertheless, this is what we have observed in a defensibly designed 
quasi-experiment. 


We must caution readers of this report against interpreting these results as definitive 
evidence of the general efficacy of the MATHia software and accompanying classroom cur- 
riculum for multiple reasons. First, our evaluation was never intended nor was it adequately 
designed to make judgments about the quality of the MATHia software program or curricu- 
lum itself. Rather, our goal was to ascertain the impact of two districts’ individual implemen- 
tations of that curriculum on teacher knowledge and student achievement on the state 
summative assessment, WESTEST 2. It would require a complex experimental design study 
with random assignment to fairly evaluate the program itself. Second, for our evaluation to 
stand as a fair trial of the program’s efficacy, we would require detailed information about 
the degree to which the program was implemented with fidelity in participating classrooms. 
Unfortunately, a major limitation of this study is that very little is known to us about the 
quality of implementation in these two districts. The fidelity metric available to us only ad- 
dressed the quantity of time students spent using the software program. We did not have 
access to any data that would stand as a suitable proxy measure of the quality of time spent 
in either the computer lab or the classroom. What we do know is that, based on data provid- 
ed by Carnegie Learning, very few students met the recommended 1.5 hours per week spent 
using the software program. In fact, even once we relaxed our fidelity criteria to 1 or more 
hours per week spent with the software program, we found that there were very few students 
that met this criterion. This is a strong indication that there was a significant gap in 
implementation fidelity. Also, the fact that we observed no significant differences in the 
mathematics achievement/gains of two matched groups of students, both of which used the 
MATHia software/curriculum, but which either did or did not meet the fidelity criteria 
recommended by the vendor, illustrates that the quality of implementation in the high use 
group may have been less than ideal. 


4 We must stress that, as noted earlier, despite confirming matching across these two groups, 
this analysis does not take into account the influence of covariates on student achievement. 


Given these limitations we must restate that we do not support using the results of 
this evaluation as a broader evaluation of the MATHia software program or classroom cur- 
riculum. Instead, we suggest interpreting these results as evidence regarding the importance 
of consistent and careful monitoring when implementing an intervention of this nature. We 
can only speculate that failure to implement with fidelity is what contributed to the lacklus- 
ter student outcomes we observed. 


As we alluded to above, our interpretation of these results is also complicated by a 
lack of knowledge about how well each classroom implemented the accompanying curriculum. 
Based on a cursory review of the standards assessed by the WESTEST 2 and the curriculum 
materials available from Carnegie Learning, we believe the curriculum selected is 
reasonably well aligned with the content assessed on the Grade 6, Grade 7, and 
Grade 8 math assessments. Therefore, it was a reasonable assumption that this curriculum, 
implemented with fidelity, would contribute to increased math knowledge on the part of 
students. However, if teachers did not progress fully through the curriculum prior to the 
administration of WESTEST 2 or if the quality of their implementation of the curriculum was 
suspect, that could certainly explain some of the results we observed. Without answers to 
these critical questions, we have very little conclusive knowledge of the reasons behind 
our results. 


Other outstanding questions from this study include why there was such a discrepan- 
cy in the number of students provided to us by the school districts vs. Carnegie Learning. It 
is possible that districts systematically excluded some classrooms from the lists they provid- 
ed or dropped classes from the initiative early on. Without additional contextual infor- 
mation, we cannot be sure. We also know very little about the quality of the training 
provided to teachers and if this training was focused on content knowledge and pedagogy 
alone or also on appropriate implementation of the software program and accompanying 
curriculum. 

Recommendations 


We make only two recommendations based on these results. First, future evaluations 
of this nature should include a great deal more data collection related to fidelity of 
implementation. Without this information, we are left to speculate about the context surrounding the 
results we observed. It is possible that Carnegie Learning collects nuanced information in 
this regard, but it was not available for this report. Even if such information were provided, 
we did not have the manpower sufficient to analyze these data or to collect additional quali- 
tative data regarding this aspect of the project. Future projects should devote at least some 
portion of the study budget to program evaluation so that it is not completely undertaken as 
an in-kind effort. Second, it is clear that close monitoring and technical assistance are 
critical to ensuring that this type of program is implemented appropriately. Deviations from 
appropriate implementation can have unintended effects. It is apparent that some monitoring did 
take place throughout the project, but continued close observation and technical assistance 
are necessary to ensure the program is successful. If other school districts are implementing 
this program, we recommend some level of ongoing monitoring take place. This monitoring 
should include measuring the amount and quality of time spent in the computer laboratory 
and the level of individual classroom-level progress through the curriculum, as well as the 
extent to which teachers deliver the curriculum as intended. 

Appendix A. Covariate Balance Summaries for Student-level Analyses 


Table A1. Covariate Balance Summary for SF16 


Covariate          Treatment   Comparison   Pre mean   Treatment   Comparison   Post mean   % improvement 
                   pre mean    pre mean     diff       post mean   post mean    diff        post-matching 
Propensity score   .0262       .0224        .0037      *           0.0261       0.001       99.47 
2010-2011 RLA      466.11      452.60       13.50      *           466.96       -0.852      93.68 
2010-2011 MATH     618.41      601.31       17.09      *           617.96       0.441       97.41 
FRPL               .445        .547         -.102      *           0.448        -0.002      97.74 
SPED               .09         .13          -.046      *           0.057        0.029       35.47 
RACE               .046        .08          -.034      *           0.048        -0.002      93.26 

*Treatment post mean is identical to treatment pre mean because no cases were discarded. 

Table A2. Covariate Balance Summary for SF17 

Covariate          Treatment   Comparison   Pre mean   Treatment   Comparison   Post mean   % improvement 
                   pre mean    pre mean     diff       post mean   post mean    diff        post-matching 
Propensity score   0.0285      0.0245       0.003      *           0.0285       0.000       99.99 
2010-2011 RLA      482.37      467.46       14.90      *           482.40       -0.038      99.74 
2010-2011 MATH     627.03      619.01       8.02       *           628.41       -1.38       82.77 
FRPL               0.437       0.524        -0.086     *           0.448        -0.010      87.55 
SPED               0.049       0.119        -0.070     *           0.036        0.012       81.69 
RACE               0.021       0.078        -0.056     *           0.023        -0.002      96.21 

*Treatment post mean is identical to treatment pre mean because no cases were discarded. 


Table A3. Covariate Balance Summary for SF18 


Covariate          Treatment   Comparison   Pre mean   Treatment   Comparison   Post mean   % improvement 
                   pre mean    pre mean     diff       post mean   post mean    diff        post-matching 
Propensity score   0.0207      0.0196       0.001      *           0.0207       0.000       99.97 
2010-2011 RLA      478.17      477.24       0.934      *           478.30       -0.128      86.30 
2010-2011 MATH     625.39      633.37       -7.977     *           626.95       -1.562      80.41 
FRPL               0.517       0.512        0.005      *           0.504        0.013       -156.50** 
SPED               0.117       0.122        -0.004     *           0.072        0.045       -830.31** 
RACE               0.061       0.076        -0.015     *           0.053        0.008       47.77 

*Treatment post mean is identical to treatment pre mean because no cases were discarded. 
**The model decreased the balance across groups for this variable. However, post mean differences were close to 
zero indicating this may be of little concern. 
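
The final column of these tables appears to follow the usual balance-improvement calculation: the percentage reduction in the absolute mean difference from before to after matching. The sketch below states that formula as our assumption, since the report does not give it. The function name is illustrative, and results computed from the rounded table entries will differ slightly from the printed percentages.

```python
def percent_improvement(pre_diff: float, post_diff: float) -> float:
    """Percent reduction in the absolute mean difference after matching.

    Negative values mean the groups were further apart after matching
    than before (balance worsened for that covariate).
    """
    return 100.0 * (1.0 - abs(post_diff) / abs(pre_diff))


# 2010-2011 RLA row of Table A3: pre diff 0.934, post diff -0.128 (printed: 86.30).
rla_improvement = percent_improvement(0.934, -0.128)

# FRPL row of Table A3: pre diff 0.005, post diff 0.013 (printed: -156.50).
frpl_improvement = percent_improvement(0.005, 0.013)

print(round(rla_improvement, 1), round(frpl_improvement, 1))
```

Large negative percentages such as the FRPL and SPED entries arise when the pre-matching difference was already very close to zero, so even a tiny post-matching difference is a large relative change.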


Table A4. Covariate Balance Summary for SF26 


Covariate          Treatment   Comparison   Pre mean   Treatment   Comparison   Post mean   % improvement 
                   pre mean    pre mean     diff       post mean   post mean    diff        post-matching 
Propensity score   0.0093      0.0083       0.001      *           0.0093       0.000       99.94 
2010-2011 RLA      457.77      452.60       5.17       *           461.11       -3.339      35.47 
2010-2011 MATH     614.74      601.31       13.43      *           616.69       -1.94       85.48 
FRPL               0.528       0.547        -0.019     *           0.534        -0.006      67.76 
SPED               0.113       0.133        -0.020     *           0.075        0.037       -84.42** 
RACE               0.025       0.080        -0.054     *           0.031        -0.006      88.55 

*Treatment post mean is identical to treatment pre mean because no cases were discarded. 
**The model decreased the balance across groups for this variable. However, post mean differences were close to 
zero indicating this may be of little concern. 


Table A5. Covariate Balance Summary for SF27 


Covariate         Treatment  Comparison  Pre Mean  Treatment  Comparison  Post Mean  % Improvement
                  Pre Mean   Pre Mean    diff      Post Mean  Post Mean   diff       post-matching
Propensity Score  0.0153     0.0128      0.0025    *          0.0153      0.000      99.99
2010-2011 RLA     480.71     467.46      13.24     *          480.07      0.633      95.21
2010-2011 MATH    620.81     619.01      1.80      *          621.71      -0.900     50.14
FRPL              0.537      0.524       0.013     *          0.554       -0.016     -23.10**
SPED              0.062      0.119       -0.057    *          0.058       0.004      92.71
RACE              0.016      0.078       -0.061    *          0.012       0.004      93.22


*Treatment post mean is identical to treatment pre mean because no cases were discarded. 
**The model decreased the balance across groups for this variable. However, post mean differences were close to 
zero indicating this may be of little concern. 


Table A6. Covariate Balance Summary for SF28 


Covariate         Treatment  Comparison  Pre Mean  Treatment  Comparison  Post Mean  % Improvement
                  Pre Mean   Pre Mean    diff      Post Mean  Post Mean   diff       post-matching
Propensity Score  0.0104     0.0089      0.0015    *          0.0104      0.000      99.99
2010-2011 RLA     483.04     477.24      5.79      *          483.76      -0.726     87.47
2010-2011 MATH    626.01     633.37      -7.35     *          627.88      -1.86      74.66
FRPL              0.541      0.512       0.029     *          0.506       0.035      -20.93
SPED              0.065      0.122       -0.056    *          0.053       0.011      79.01
RACE              0.053      0.076       -0.023    *          0.041       0.011      48.42


*Treatment post mean is identical to treatment pre mean because no cases were discarded. 
**The model decreased the balance across groups for this variable. However, post mean differences were close to 
zero indicating this may be of little concern. 


Table A7. Covariate Balance Summary for SF3 


Covariate         Treatment  Comparison  Pre Mean  Treatment  Comparison  Post Mean  % Improvement
                  Pre Mean   Pre Mean    diff      Post Mean  Post Mean   diff       post-matching
Propensity Score  0.0031     0.0026      0.0005    *          0.0031      0.000      99.99
2010-2011 RLA     466.54     465.73      0.807     *          466.20      0.337      58.15
2010-2011 MATH    614.09     617.85      -3.76     *          617.26      -3.16      15.72
FRPL              0.587      0.528       0.059     *          0.567       0.020      66.10
SPED              0.087      0.125       -0.037    *          0.067       0.020      45.80
RACE              0.020      0.078       -0.058    *          0.020       0.000      100.00
GRD6              0.337      0.336       0.001     *          0.331       0.006      -466.34**
GRD7              0.229      0.329       -0.099    *          0.223       0.006      93.19
GRD8              0.432      0.334       0.098     *          0.445       -0.013     86.22


*Treatment post mean is identical to treatment pre mean because no cases were discarded. 
**The model decreased the balance across groups for this variable. However, post mean differences were close to 
zero indicating this may be of little concern. 


Table A8. Covariate Balance Summary for SF4 


Covariate         Treatment  Comparison  Pre Mean  Treatment  Comparison  Post Mean  % Improvement
                  Pre Mean   Pre Mean    diff      Post Mean  Post Mean   diff       post-matching
Propensity Score  0.1395     0.1095      0.030     *          0.1395      0.000      99.93
2010-2011 RLA     466.68     476.72      -10.04    *          465.84      0.833      91.70
2010-2011 MATH    613.93     624.84      -10.91    *          615.65      -1.722     84.22
FRPL              0.583      0.448       0.134     *          0.555       0.027      79.35
SPED              0.090      0.081       0.009     *          0.097       -0.006     22.88
RACE              0.020      0.044       -0.023    *          0.006       0.013      40.48
GRD6              0.347      0.340       0.007     *          0.361       -0.013     -95.17**
GRD7              0.222      0.383       -0.161    *          0.215       0.006      95.69
GRD8              0.430      0.276       0.154     *          0.423       0.006      95.49


*Treatment post mean is identical to treatment pre mean because no cases were discarded. 
**The model decreased the balance across groups for this variable. However, post mean differences were 
close to zero indicating this may be of little concern. 
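
The "% Improvement post-matching" column in the tables above appears consistent with the usual percent balance improvement statistic, 100 × (|pre diff| − |post diff|) / |pre diff|. The following is a minimal sketch under that assumption; the function name is hypothetical, and because the tabulated differences are rounded, the results only approximate the printed percentages.

```python
def percent_balance_improvement(pre_diff: float, post_diff: float) -> float:
    """Percent reduction in the absolute mean difference after matching.

    Negative values mean the groups were less balanced after matching.
    """
    if pre_diff == 0:
        raise ValueError("pre-matching difference is zero; improvement is undefined")
    return 100.0 * (abs(pre_diff) - abs(post_diff)) / abs(pre_diff)


# Table A3, 2010-2011 RLA: pre diff 0.934, post diff -0.128 (table reports 86.30)
rla_improvement = percent_balance_improvement(0.934, -0.128)

# Table A3, FRPL: pre diff 0.005, post diff 0.013 (table reports -156.50,
# computed from unrounded differences, so this value is only approximate)
frpl_improvement = percent_balance_improvement(0.005, 0.013)

print(round(rla_improvement, 2), round(frpl_improvement, 2))
```

A negative result corresponds to the rows flagged with ** in the tables, where matching widened a difference that was already small.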
Appendix B. Power Analyses for Student-level Analyses 


Table B1. Sample Size and Power by Sampling Frame (Hypothesis 2) 


Sample       Comparison  Treatment  Power to Detect Moderate Effect (a > .95)
SF16
  All        18,875      435
  Matched    435         435
  Unmatched  18,440      0
  Final N    870                    YES
SF17
  All        18,446      466
  Matched    466         466
  Unmatched  17,980      0
  Final N    932                    YES
SF18
  All        18,747      375
  Matched    375         375
  Unmatched  18,372      0
  Final N    750                    YES

Table B2. Sample Size and Power by Sampling Frame (Hypothesis 3) 


Sample       Comparison  Treatment  Power to Detect Moderate Effect (a = .95)
SF26
  All        18,875      159
  Matched    159         159
  Unmatched  18,716      0
  Final N    318                    NO (.76)
SF27
  All        18,446      240
  Matched    240         240
  Unmatched  18,206      0
  Final N    480                    NO (.90)
SF28
  All        18,747      168
  Matched    168         168
  Unmatched  18,579      0
  Final N    336                    NO (.78)

Table B3. Sample Size and Power by Sampling Frame (Hypothesis 4) 


Sample       Comparison  Treatment  Power to Detect Moderate Effect (a > .95)
SF3
  All        56,068      148
  Matched    148         148
  Unmatched  55,920      0
  Final N    296                    NO (.73)

Table B4. Sample Size and Power by Sampling Frame (Hypothesis 5) 


Sample       Comparison  Treatment  Power to Detect Moderate Effect (a > .95)
SF4
  All        1,132       144
  Matched    144         144
  Unmatched  988         0
  Final N    288                    NO (.72)
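
The appendix does not state how the power values were computed. As a rough illustration only, the sketch below uses a normal approximation for a two-sided, two-group comparison of means with equal matched group sizes. The standardized effect size (d = 0.3) and the 5% significance level are assumptions chosen because they give values close to the tabulated ones; they are not taken from the report, and the function names are hypothetical.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def two_group_power(n_per_group: int, d: float = 0.3, z_crit: float = 1.959964) -> float:
    """Approximate power of a two-sided test comparing two equal-sized groups.

    d is the assumed standardized mean difference; z_crit is the two-sided
    critical value for a 5% significance level (both are assumptions).
    """
    ncp = d * math.sqrt(n_per_group / 2.0)  # noncentrality under the alternative
    return normal_cdf(ncp - z_crit) + normal_cdf(-ncp - z_crit)


# Matched treatment-group sizes from Tables B1-B4
matched_n = {"SF16": 435, "SF17": 466, "SF18": 375, "SF26": 159,
             "SF27": 240, "SF28": 168, "SF3": 148, "SF4": 144}

power_by_frame = {frame: two_group_power(n) for frame, n in matched_n.items()}

for frame, power in power_by_frame.items():
    print(f"{frame}: n per group = {matched_n[frame]}, approximate power = {power:.2f}")
```

Under these assumptions the three Hypothesis 2 frames exceed .95 and the remaining frames fall below it, which matches the YES/NO pattern in the tables.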
Appendix C. Detailed Statistics for Linear Models Used to Test the Impact of Treatment when Accounting for Measured Covariates 


Table C1. Model Summaries for SF16 (Math Achievement) 


Model Summary 


                                              Change Statistics
Model  R Square  Adjusted    Std. Error of    R Square  F Change  df1  df2  Sig. F
                 R Square    the Estimate     Change                        Change
1                            .568752161645    .626      240.367   6    863  .000
2                            .567148612508    .003      5.887     1    862  .015
a. Predictors: (Constant), SEX11, LSES11, WHITE11, SPED11, SSM11, SSR11 


b. Predictors: (Constant), SEX11, LSES11, WHITE11, SPED11, SSM11, SSR11, treat 
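

The F Change values in the Model Summary tables can be related to the R Square Change values through the standard hierarchical-regression formula, F = (ΔR² / df1) / ((1 − R²_full) / df2). The sketch below assumes that formula applies here; the function name is hypothetical, and because the tabulated R Square Change values are rounded to three decimals, the results only approximate the printed F statistics.

```python
def f_change(r2_reduced: float, r2_full: float, df1: int, df2: int) -> float:
    """F statistic for the gain in R^2 when df1 predictors are added.

    df2 is the residual degrees of freedom of the fuller model.
    """
    return ((r2_full - r2_reduced) / df1) / ((1.0 - r2_full) / df2)


# Table C1, Model 1: six covariates explain about .626 of the variance
# (table reports F Change = 240.367 with df 6 and 863)
f_model1 = f_change(0.0, 0.626, 6, 863)

# Table C1, Model 2: adding `treat` raises R^2 by about .003
# (table reports F Change = 5.887 with df 1 and 862; the rounded .003
# makes this approximation noticeably coarser)
f_model2 = f_change(0.626, 0.626 + 0.003, 1, 862)

print(round(f_model1, 1), round(f_model2, 1))
```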


Table C2. ANOVA Statistics for SF16 (Math Achievement) 


ANOVAᵃ 


Regression 
Residual 


Total 


Regression 


Residual 


Total 


a. Dependent Variable: ZSSM12 
b. Predictors: (Constant), SEX11, LSES11, WHITE11, SPED11, SSM11, SSR11 


c. Predictors: (Constant), SEX11, LSES11, WHITE11, SPED11, SSM11, SSR11, treat 


Table C3. Coefficient Summaries for SF16 (2011-2012 Math Achievement) 


Coefficientsᵃ 


Unstandardized Coefficients | Standardized Coefficients | 95.0% Confidence Interval for B 


(Constant) 
SSM11 
SSR11 
WHITE11 
LSES11 
SPED11 
SEX11 
(Constant) 
SSM11 
SSR11 
WHITE11 
LSES11 
SPED11 
SEX11 


treat 


a. Dependent Variable: ZSSM12 




Table C4. Model Summaries for SF16 (Math Gains) 


Model Summary 


                                              Change Statistics
Model  R Square  Adjusted    Std. Error of    R Square  F Change  df1  df2  Sig. F
                 R Square    the Estimate     Change                        Change
1                            .568752161648    .228                6    863  .000
2                            .567148612511


a. Predictors: (Constant), SEX11, LSES11, WHITE11, SPED11, SSM11, SSR11 


b. Predictors: (Constant), SEX11, LSES11, WHITE11, SPED11, SSM11, SSR11, treat 


Table C5. ANOVA Statistics for SF16 (Math Gains) 


ANOVAᵃ 


Regression 
Residual 


Total 


Regression 


Residual 


Total 


a. Dependent Variable: MATHGAIN 
b. Predictors: (Constant), SEX11, LSES11, WHITE11, SPED11, SSM11, SSR11 


c. Predictors: (Constant), SEX11, LSES11, WHITE11, SPED11, SSM11, SSR11, treat 


Table C6. Coefficient Summaries for SF16 (Math Gains) 


Coefficientsᵃ 


Unstandardized Coefficients | Standardized Coefficients | 95.0% Confidence Interval for B 


(Constant) 
SSM11 
SSR11 
WHITE11 
LSES11 
SPED11 
SEX11 
(Constant) 
SSM11 
SSR11 
WHITE11 
LSES11 
SPED11 
SEX11 


treat 


a. Dependent Variable: MATHGAIN 


Table C7. Model Summaries for SF17 (Math Achievement) 


Model Summary 


                                              Change Statistics
Model  R Square  Adjusted    Std. Error of    R Square  F Change  df1  df2  Sig. F
                 R Square    the Estimate     Change                        Change
1                            .616209861140    .598      229.066   6    925  .000
2                            .614210903534    .003      7.031     1    924  .008
a. Predictors: (Constant), SEX11, WHITE11, LSES11, SPED11, SSM11, SSR11 


b. Predictors: (Constant), SEX11, WHITE11, LSES11, SPED11, SSM11, SSR11, treat 


Table C8. ANOVA Statistics for SF17 (Math Achievement) 


ANOVAᵃ 


Model            F        Sig.
1  Regression    229.066  .000b
   Residual
   Total
2  Regression    198.627  .000c
   Residual
   Total


a. Dependent Variable: ZSSM12 


b. Predictors: (Constant), SEX11, WHITE11, LSES11, SPED11, SSM11, SSR11 


c. Predictors: (Constant), SEX11, WHITE11, LSES11, SPED11, SSM11, SSR11, treat 


Table C9. Coefficient Summaries for SF17 (Math Achievement) 


Coefficientsᵃ 


Unstandardized Coefficients | Standardized Coefficients | 95.0% Confidence Interval for B 


(Constant) 
SSM11 
SSR11 
WHITE11 
LSES11 
SPED11 
SEX11 
(Constant) 
SSM11 
SSR11 
WHITE11 
LSES11 
SPED11 
SEX11 


treat 


a. Dependent Variable: ZSSM12 


Table C10. Model Summaries for SF17 (Math Gains) 


Model Summary 


                                              Change Statistics
Model  R Square  Adjusted    Std. Error of    R Square  F Change  df1  df2  Sig. F
                 R Square    the Estimate     Change                        Change
1      .208                  .616209861137    .208                6    925  .000
2                            .614210903531


a. Predictors: (Constant), SEX11, WHITE11, LSES11, SPED11, SSM11, SSR11 


b. Predictors: (Constant), SEX11, WHITE11, LSES11, SPED11, SSM11, SSR11, treat 


Table C11. ANOVA Statistics for SF17 (Math Gains) 


ANOVAᵃ 


Model            F
1  Regression    40.577
   Residual
   Total
2  Regression    36.012
   Residual
   Total


a. Dependent Variable: MATHGAIN 


b. Predictors: (Constant), SEX11, WHITE11, LSES11, SPED11, SSM11, SSR11 


c. Predictors: (Constant), SEX11, WHITE11, LSES11, SPED11, SSM11, SSR11, treat 


Table C12. Coefficient Summaries for SF17 (Math Gains) 


Coefficientsᵃ 


Unstandardized Coefficients | Standardized Coefficients | 95.0% Confidence Interval for B 


(Constant) 
SSM11 
SSR11 
WHITE11 
LSES11 
SPED11 
SEX11 
(Constant) 
SSM11 
SSR11 
WHITE11 
LSES11 
SPED11 
SEX11 


treat 


a. Dependent Variable: MATHGAIN 


Table C13. Model Summaries for SF18 (Math Achievement) 


Model Summary 


                                              Change Statistics
Model  R Square  Adjusted    Std. Error of    R Square  F Change  df1  df2  Sig. F
                 R Square    the Estimate     Change                        Change
1                            .718734366395    .440      97.220    6    743  .000
2                            .708537706902    .017      22.539    1    742  .000
a. Predictors: (Constant), SEX11, LSES11, WHITE11, SPED11, SSM11, SSR11 


b. Predictors: (Constant), SEX11, LSES11, WHITE11, SPED11, SSM11, SSR11, treat 


Table C14. ANOVA Statistics for SF18 (Math Achievement) 


ANOVAᵃ 


Model            F        Sig.
1  Regression    97.220   .000b
   Residual
   Total
2  Regression    88.967   .000c
   Residual
   Total


a. Dependent Variable: ZSSM12 


b. Predictors: (Constant), SEX11, LSES11, WHITE11, SPED11, SSM11, SSR11 


c. Predictors: (Constant), SEX11, LSES11, WHITE11, SPED11, SSM11, SSR11, treat 


Table C15. Coefficient Summaries for SF18 (Math Achievement) 


Coefficientsᵃ 


Unstandardized Coefficients | Standardized Coefficients | Sig. | 95.0% Confidence Interval for B 
.000 ; : 


(Constant) 
SSM11 
SSR11 
WHITE11 
LSES11 
SPED11 
SEX11 
(Constant) 
SSM11 
SSR11 
WHITE11 
LSES11 
SPED11 
SEX11 


treat 


a. Dependent Variable: ZSSM12 


Table C16. Model Summaries for SF18 (Math Gains) 


Model Summary 


                                              Change Statistics
Model  R Square  Adjusted    Std. Error of    R Square  F Change  df1  df2  Sig. F
                 R Square    the Estimate     Change                        Change
1                            .718734366375    .173      25.824    6    743  .000
2                            .708537706884    .024      22.539    1    742  .000
a. Predictors: (Constant), SEX11, LSES11, WHITE11, SPED11, SSM11, SSR11 


b. Predictors: (Constant), SEX11, LSES11, WHITE11, SPED11, SSM11, SSR11, treat 


Table C17. ANOVA Statistics for SF18 (Math Gains) 


ANOVAᵃ 


Model            F
1  Regression    25.824
   Residual
   Total
2  Regression    25.997
   Residual
   Total


a. Dependent Variable: MATHGAIN 


b. Predictors: (Constant), SEX11, LSES11, WHITE11, SPED11, SSM11, SSR11 


c. Predictors: (Constant), SEX11, LSES11, WHITE11, SPED11, SSM11, SSR11, treat 


Table C18. Coefficient Summaries for SF18 (Math Gains) 


Coefficientsᵃ 


Unstandardized Coefficients | Standardized Coefficients | 95.0% Confidence Interval for B 


(Constant) 
SSM11 
SSR11 
WHITE11 
LSES11 
SPED11 
SEX11 
(Constant) 
SSM11 
SSR11 
WHITE11 
LSES11 
SPED11 
SEX11 


treat 


a. Dependent Variable: MATHGAIN 


Table C19. Model Summaries for SF26 (Math Achievement) 


Model Summary 


                                              Change Statistics
Model  R Square  Adjusted    Std. Error of    R Square  F Change  df1  df2  Sig. F
                 R Square    the Estimate     Change                        Change
1                            .526955678942                        6
2                            .517467436514
a. Predictors: (Constant), SEX11, WHITE11, SPED11, LSES11, SSM11, SSR11 


b. Predictors: (Constant), SEX11, WHITE11, SPED11, LSES11, SSM11, SSR11, treat 


Table C20. ANOVA Statistics for SF26 (Math Achievement) 


ANOVAᵃ 


Regression 
Residual 


Total 


Regression 


Residual 


Total 


a. Dependent Variable: ZSSM12 
b. Predictors: (Constant), SEX11, WHITE11, SPED11, LSES11, SSM11, SSR11 


c. Predictors: (Constant), SEX11, WHITE11, SPED11, LSES11, SSM11, SSR11, treat 


Table C21. Coefficient Summaries for SF26 (Math Achievement) 


Coefficientsᵃ 


Unstandardized Coefficients | Standardized Coefficients | 95.0% Confidence Interval for B 


(Constant) 
SSM11 
SSR11 
WHITE11 
LSES11 
SPED11 
SEX11 
(Constant) 
SSM11 
SSR11 
WHITE11 
LSES11 
SPED11 
SEX11 


treat 


a. Dependent Variable: ZSSM12 


Table C22. Model Summaries for SF26 (Math Gains) 


Model Summary 


                                              Change Statistics
Model  R Square  Adjusted    Std. Error of    R Square  F Change  df1  df2  Sig. F
                 R Square    the Estimate     Change                        Change
1                            .526955678968    .280      20.191    6    311  .000
2                            .517467436541    .028      12.510    1    310  .000
a. Predictors: (Constant), SEX11, WHITE11, SPED11, LSES11, SSM11, SSR11 


b. Predictors: (Constant), SEX11, WHITE11, SPED11, LSES11, SSM11, SSR11, treat 


Table C23. ANOVA Statistics for SF26 (Math Gains) 


ANOVAᵃ 


Model            F
1  Regression    20.191
   Residual
   Total
2  Regression    19.735
   Residual
   Total


a. Dependent Variable: MATHGAIN 


b. Predictors: (Constant), SEX11, WHITE11, SPED11, LSES11, SSM11, SSR11 


c. Predictors: (Constant), SEX11, WHITE11, SPED11, LSES11, SSM11, SSR11, treat 




Table C24. Coefficient Summaries for SF26 (Math Gains) 


Coefficientsᵃ 


Unstandardized Coefficients | Standardized Coefficients | 95.0% Confidence Interval for B 


(Constant) 
SSM11 
SSR11 
WHITE11 
LSES11 
SPED11 
SEX11 
(Constant) 
SSM11 
SSR11 
WHITE11 
LSES11 
SPED11 
SEX11 


treat 


a. Dependent Variable: MATHGAIN 




Table C25. Model Summaries for SF27 (Math Achievement) 


Model Summary 


                                              Change Statistics
Model  R Square  Adjusted    Std. Error of    R Square  F Change  df1  df2  Sig. F
                 R Square    the Estimate     Change                        Change
1                            .652136894472    .531                6    473  .000
2                            .652747500812


a. Predictors: (Constant), SEX11, LSES11, WHITE11, SPED11, SSM11, SSR11 


b. Predictors: (Constant), SEX11, LSES11, WHITE11, SPED11, SSM11, SSR11, treat 


Table C26. ANOVA Statistics for SF27 (Math Achievement) 


ANOVAᵃ 


Model            F
1  Regression    89.121
   Residual
   Total
2  Regression    76.263
   Residual
   Total


a. Dependent Variable: ZSSM12 


b. Predictors: (Constant), SEX11, LSES11, WHITE11, SPED11, SSM11, SSR11 


c. Predictors: (Constant), SEX11, LSES11, WHITE11, SPED11, SSM11, SSR11, treat 


Table C27. Coefficient Summaries for SF27 (Math Achievement) 


Coefficientsᵃ 


Unstandardized Coefficients | Standardized Coefficients | 95.0% Confidence Interval for B 


(Constant) 
SSM11 
SSR11 
WHITE11 
LSES11 
SPED11 
SEX11 
(Constant) 
SSM11 
SSR11 
WHITE11 
LSES11 
SPED11 
SEX11 


treat 


a. Dependent Variable: ZSSM12 


Table C28. Model Summaries for SF27 (Math Gains) 


Model Summary 


                                              Change Statistics
Model  R Square  Adjusted    Std. Error of    R Square  F Change  df1  df2  Sig. F
                 R Square    the Estimate     Change                        Change
1                .191        .652136894458    .202                6    473  .000
2                .190        .652747500798
a. Predictors: (Constant), SEX11, LSES11, WHITE11, SPED11, SSM11, SSR11 


b. Predictors: (Constant), SEX11, LSES11, WHITE11, SPED11, SSM11, SSR11, treat 


Table C29. ANOVA Statistics for SF27 (Math Gains) 
ANOVAᵃ 


Regression 
Residual 


Total 


Regression 


Residual 


Total 


a. Dependent Variable: MATHGAIN 
b. Predictors: (Constant), SEX11, LSES11, WHITE11, SPED11, SSM11, SSR11 


c. Predictors: (Constant), SEX11, LSES11, WHITE11, SPED11, SSM11, SSR11, treat 


Table C30. Coefficient Summaries for SF27 (Math Gains) 


Coefficientsᵃ 


Unstandardized Coefficients | Standardized Coefficients | 95.0% Confidence Interval for B 


(Constant) 
SSM11 
SSR11 
WHITE11 
LSES11 
SPED11 
SEX11 
(Constant) 
SSM11 
SSR11 
WHITE11 
LSES11 
SPED11 
SEX11 


treat 


a. Dependent Variable: MATHGAIN 


Table C31. Model Summaries for SF28 (Math Achievement) 


Model Summary 


                                              Change Statistics
Model  R Square  Adjusted    Std. Error of    R Square  F Change  df1  df2  Sig. F
                 R Square    the Estimate     Change                        Change
1                            .744333247367    .361                6    329  .000
2                            .724035146399


a. Predictors: (Constant), SEX11, LSES11, WHITE11, SPED11, SSM11, SSR11 


b. Predictors: (Constant), SEX11, LSES11, WHITE11, SPED11, SSM11, SSR11, treat 


Table C32. ANOVA Statistics for SF28 (Math Achievement) 


ANOVAᵃ 


Regression 
Residual 


Total 


Regression 


Residual 


Total 


a. Dependent Variable: ZSSM12 
b. Predictors: (Constant), SEX11, LSES11, WHITE11, SPED11, SSM11, SSR11 


c. Predictors: (Constant), SEX11, LSES11, WHITE11, SPED11, SSM11, SSR11, treat 


Table C33. Coefficient Summaries for SF28 (Math Achievement) 


Coefficientsᵃ 


Unstandardized Coefficients | Standardized Coefficients | 95.0% Confidence Interval for B 


(Constant) 
SSM11 
SSR11 
WHITE11 
LSES11 
SPED11 
SEX11 
(Constant) 
SSM11 
SSR11 
WHITE11 
LSES11 
SPED11 
SEX11 


treat 


a. Dependent Variable: ZSSM12 


Table C34. Model Summaries for SF28 (Math Gains) 
Model Summary 


                                              Change Statistics
Model  R Square  Adjusted    Std. Error of    R Square  F Change  df1  df2  Sig. F
                 R Square    the Estimate     Change                        Change
1                            .744333247341              15.229    6    329  .000
2                            .724035146373              19.705    1    328  .000


a. Predictors: (Constant), SEX11, LSES11, WHITE11, SPED11, SSM11, SSR11 


b. Predictors: (Constant), SEX11, LSES11, WHITE11, SPED11, SSM11, SSR11, treat 


Table C35. ANOVA Statistics for SF28 (Math Gains) 
ANOVAᵃ 


Model            F        Sig.
1  Regression    15.229   .000b
   Residual
   Total
2  Regression    16.611   .000c
   Residual
   Total


a. Dependent Variable: MATHGAIN 


b. Predictors: (Constant), SEX11, LSES11, WHITE11, SPED11, SSM11, SSR11 


c. Predictors: (Constant), SEX11, LSES11, WHITE11, SPED11, SSM11, SSR11, treat 


Table C36. Coefficient Summaries for SF28 (Math Gains) 


Coefficientsᵃ 


Unstandardized Coefficients | Standardized Coefficients | 95.0% Confidence Interval for B 


(Constant) 
SSM11 
SSR11 
WHITE11 
LSES11 
SPED11 
SEX11 
(Constant) 
SSM11 
SSR11 
WHITE11 
LSES11 
SPED11 
SEX11 


treat 


a. Dependent Variable: MATHGAIN 


Table C37. Model Summaries for SF3 (Math Achievement) 


Model Summary 


                                              Change Statistics
Model  R Square  Adjusted    Std. Error of    R Square  F Change  df1  df2  Sig. F
                 R Square    the Estimate     Change                        Change
1                            .594238923336    .491                8    287  .000
2                            .580011587217


a. Predictors: (Constant), SEX11, WHITE11, GRD6, LSES11, SPED11, PLM11, GRD7, PLR11 


b. Predictors: (Constant), SEX11, WHITE11, GRD6, LSES11, SPED11, PLM11, GRD7, PLR11, treat 


Table C38. ANOVA Statistics for SF3 (Math Achievement) 


ANOVAᵃ 


Model            F
1  Regression    34.556
   Residual
   Total
2  Regression    33.936
   Residual
   Total


a. Dependent Variable: ZSSM12 


b. Predictors: (Constant), SEX11, WHITE11, GRD6, LSES11, SPED11, PLM11, GRD7, PLR11 


c. Predictors: (Constant), SEX11, WHITE11, GRD6, LSES11, SPED11, PLM11, GRD7, PLR11, treat 


Table C39. Coefficient Summaries for SF3 (Math Achievement) 


Coefficientsᵃ 


Unstandardized Coefficients (B, Std. Error) | Standardized Coefficients (Beta) | 95.0% Confidence Interval for B (Lower Bound, Upper Bound) 


(Constant) 
PLM11 
PLR11 
GRD6 
GRD7 
WHITE11 
LSES11 
SPED11 
SEX11 
(Constant) 
PLM11 
PLR11 
GRD6 
GRD7 
WHITE11 
LSES11 
SPED11 
SEX11 
treat 

a. Dependent Variable: ZSSM12 


Table C40. Model Summaries for SF3 (Math Gains) 


Model Summary 


                                              Change Statistics
Model  R Square  Adjusted    Std. Error of    R Square  F Change  df1  df2  Sig. F
                 R Square    the Estimate     Change                        Change
1                            .660594616962    .143                8    287  .000
2      .178                  .648318321312
a. Predictors: (Constant), SEX11, WHITE11, GRD6, LSES11, SPED11, PLM11, GRD7, PLR11 


b. Predictors: (Constant), SEX11, WHITE11, GRD6, LSES11, SPED11, PLM11, GRD7, PLR11, treat 


Table C41. ANOVA Statistics for SF3 (Math Gains) 
ANOVAᵃ 


Model            F
1  Regression    6.008
   Residual
   Total
2  Regression    6.875
   Residual
   Total


a. Dependent Variable: MATHGAIN 


b. Predictors: (Constant), SEX11, WHITE11, GRD6, LSES11, SPED11, PLM11, GRD7, PLR11 


c. Predictors: (Constant), SEX11, WHITE11, GRD6, LSES11, SPED11, PLM11, GRD7, PLR11, treat 


Table C42. Coefficient Summaries for SF3 (Math Gains) 


Coefficientsᵃ 


Unstandardized Coefficients | Standardized Coefficients | 95.0% Confidence Interval for B 


(Constant) 
PLM11 
PLR11 
GRD6 
GRD7 
WHITE11 
LSES11 
SPED11 
SEX11 
(Constant) 
PLM11 
PLR11 
GRD6 
GRD7 
WHITE11 
LSES11 
SPED11 
SEX11 
treat 

a. Dependent Variable: MATHGAIN 


Table C43. Model Summaries for SF4 (Math Achievement) 


Model Summary 


                                              Change Statistics
Model  R Square  Adjusted    Std. Error of    R Square  F Change  df1  df2  Sig. F
                 R Square    the Estimate     Change                        Change
1                            .647828147782    .477                8    279  .000
2                            .648215096971


a. Predictors: (Constant), SEX11, LSES11, SPED11, GRD7, WHITE11, PLM11, GRD6, PLR11 


b. Predictors: (Constant), SEX11, LSES11, SPED11, GRD7, WHITE11, PLM11, GRD6, PLR11, treat 


Table C44. ANOVA Statistics for SF4 (Math Achievement) 
ANOVAᵃ 


Regression 
Residual 


Total 


Regression 


Residual 


Total 


a. Dependent Variable: ZSSM12 
b. Predictors: (Constant), SEX11, LSES11, SPED11, GRD7, WHITE11, PLM11, GRD6, PLR11 


c. Predictors: (Constant), SEX11, LSES11, SPED11, GRD7, WHITE11, PLM11, GRD6, PLR11, treat 


Table C45. Coefficient Summaries for SF4 (Math Achievement) 


Coefficientsᵃ 


Unstandardized Coefficients | Standardized Coefficients | 95.0% Confidence Interval for B 


(Constant) 
PLM11 
PLR11 
GRD6 
GRD7 
WHITE11 
LSES11 
SPED11 
SEX11 
(Constant) 
PLM11 
PLR11 
GRD6 
GRD7 
WHITE11 
LSES11 
SPED11 
SEX11 
treat 

a. Dependent Variable: ZSSM12 


Table C46. Model Summaries for SF4 (Math Gains) 


Model Summary 


                                              Change Statistics
Model  R Square  Adjusted    Std. Error of    R Square  F Change  df1  df2  Sig. F
                 R Square    the Estimate     Change                        Change
1                            .714297081012    .107                8    279
2                            .714478631858


a. Predictors: (Constant), SEX11, LSES11, SPED11, GRD7, WHITE11, PLM11, GRD6, PLR11 


b. Predictors: (Constant), SEX11, LSES11, SPED11, GRD7, WHITE11, PLM11, GRD6, PLR11, treat 


Table C47. ANOVA Statistics for SF4 (Math Gains) 


ANOVAᵃ 


Model            F
1  Regression    4.160
   Residual
   Total
2  Regression    3.791
   Residual
   Total


a. Dependent Variable: MATHGAIN 


b. Predictors: (Constant), SEX11, LSES11, SPED11, GRD7, WHITE11, PLM11, GRD6, PLR11 


c. Predictors: (Constant), SEX11, LSES11, SPED11, GRD7, WHITE11, PLM11, GRD6, PLR11, treat 


Table C48. Coefficient Summaries for SF4 (Math Gains) 


Coefficientsᵃ 


Unstandardized Coefficients | Standardized Coefficients | t | Sig. | 95.0% Confidence Interval for B 
.291 


ee 
.168) .159 


(Constant) 481 


PLM11 
PLR11 
GRD6 
GRD7 
WHITE11 
LSES11 
SPED11 
SEX11 
(Constant) 
PLM11 
PLR11 
GRD6 
GRD7 
WHITE11 
LSES11 
SPED11 
SEX11 
treat 

a. Dependent Variable: MATHGAIN 